Deep linear neural networks with arbitrary loss: All local minima are global
Abstract
We consider deep linear networks with arbitrary differentiable loss. We provide a short and elementary proof of the following fact: all local minima are global minima if each hidden layer is wider than either the input or output layer.
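One way to write the statement above in symbols is sketched below. The notation ($W_k$, $d_k$, $f$, $\mathcal{L}$) is assumed here for illustration and does not come from the abstract. Reading "wider than either the input or output layer" as "at least as wide as the smaller of the two" is also an interpretation, not a quotation.

```latex
% Assumed notation (not from the abstract):
%   d_0 = input width, d_L = output width, d_1, ..., d_{L-1} = hidden widths.
% A deep linear network is a product of weight matrices with no nonlinearity.
\[
  \hat{y}(x) = W_L W_{L-1} \cdots W_1 \, x,
  \qquad W_k \in \mathbb{R}^{d_k \times d_{k-1}} .
\]
% On fixed data, the loss depends on the weights only through the product.
\[
  \mathcal{L}(W_1, \dots, W_L) = f\bigl(W_L W_{L-1} \cdots W_1\bigr),
  \qquad f : \mathbb{R}^{d_L \times d_0} \to \mathbb{R} \ \text{differentiable} .
\]
% The claim of the abstract, under the width reading described above.
\[
  d_k \ge \min\{d_0, d_L\} \ \text{ for } k = 1, \dots, L-1
  \;\Longrightarrow\;
  \text{every local minimum of } \mathcal{L} \text{ is a global minimum.}
\]
```

Under this reading, the width condition means that no hidden layer forms a bottleneck narrower than both ends of the network, so the product $W_L \cdots W_1$ can reach every $d_L \times d_0$ matrix.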
Similar resources
A Critical View of Global Optimality in Deep Learning
We investigate the loss surface of deep linear and nonlinear neural networks. We show that for deep linear networks with differentiable losses, critical points after the multilinear parameterization inherit the structure of critical points of the underlying loss with linear parameterization. As corollaries we obtain “local minima are global” results that subsume most previous results, while sho...
Depth Creates No Bad Local Minima
In deep learning, depth, as well as nonlinearity, create non-convex loss surfaces. Then, does depth alone create bad local minima? In this paper, we prove that without nonlinearity, depth alone does not create bad local minima, although it induces non-convex loss surface. Using this insight, we greatly simplify a recently proposed proof to show that all of the local minima of feedforward deep l...
The Multilinear Structure of ReLU Networks
We study the loss surface of neural networks equipped with a hinge loss criterion and ReLU or leaky ReLU nonlinearities. Any such network defines a piecewise multilinear form in parameter space, and as a consequence, optima of such networks generically occur in non-differentiable regions of parameter space. Any understanding of such networks must therefore carefully take into account their non-...
Link Prediction using Network Embedding based on Global Similarity
Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...
Global optimality conditions for deep neural networks
We study the error landscape of deep linear and nonlinear neural networks with the squared error loss. Minimizing the loss of a deep linear neural network is a nonconvex problem, and despite recent progress, our understanding of this loss surface is still incomplete. For deep linear networks, we present necessary and sufficient conditions for a critical point of the risk function to be a global...
Journal:
- CoRR
Volume: abs/1712.01473
Pages: -
Publication date: 2017